Multi-phase array redistribution: modeling and evaluation
Authors
Abstract
[Table 1: Execution times (ms) for cyclic(s) to cyclic(t) redistribution on 32 processors. Columns: s, t, lcm, lcm*2, lcm*4, gcd, gcd/2, gcd/4.]

… other block sizes t. Fig. 3 shows the total times in milliseconds for a cyclic(192) to cyclic(8) redistribution on 32 processors for increasing data sizes. This redistribution corresponds to the cyclic(Yt) to cyclic(t) case with Y = 24. The two-phase redistribution is cyclic(192) → cyclic(48) → cyclic(8). The two-phase strategy performs better than the single-phase strategy up to a data size of approximately 90K. Note that the crossover between the single-phase and two-phase redistribution occurs at a lower data size for Y = 24 than for Y = 30. This behavior can be explained by noting that the reduction in the number of message startups with the two-phase redistribution is greater for Y = 30. Similar patterns were observed for the cyclic(s) to cyclic(Ys) redistribution.

We now evaluate the effect of the choice of common multiples and divisors on the general cyclic(s) to cyclic(t) redistribution. Table 1 shows the two-stage redistribution times for various source and target block sizes, using various common multiples and divisors as the intermediate block size. The two-stage redistributions that use the lcm or the gcd as the intermediate block size have nearly equal execution times, and both perform better than two-stage redistributions that use a common multiple greater than the lcm or a common divisor smaller than the gcd.

7 Conclusion

We have presented a multi-phase approach for performing communication-efficient data redistribution for block-cyclically distributed arrays. We have developed precise closed-form expressions for the send and receive processor sets and data index sets for two special cases: where the block size of the source block-cyclic distribution is a multiple of the block size of the target block-cyclic distribution, and vice versa. These closed forms facilitate the development of a distributed scheduling algorithm for performing the all-to-many personalized communication, and the development of a communication cost model for array redistribution. Based on this model, we demonstrate that the multi-phase redistribution strategy can reduce the total cost of array redistribution. Performance results on the Cray T3D show that the multi-phase strategy can improve performance over the single-phase strategy. The multi-phase strategy is being extended to handle redistribution of multi-dimensional arrays. …
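The message-startup argument above can be illustrated with a small sketch. It assumes the usual block-cyclic ownership rule, in which element i under cyclic(b) on P processors belongs to processor (i div b) mod P, and it uses the number of distinct destination processors per sender as a rough proxy for message startups. The function names are hypothetical, and this is not the paper's cost model: it ignores message volume, which is what eventually favors the single-phase strategy at large data sizes.

```python
from math import gcd


def owner(i, block, nprocs):
    """Processor owning global element i under cyclic(block) on nprocs processors."""
    return (i // block) % nprocs


def max_startups(n, src_block, dst_block, nprocs):
    """Largest number of distinct remote destinations any one processor sends to
    when an n-element array goes from cyclic(src_block) to cyclic(dst_block)."""
    dests = [set() for _ in range(nprocs)]
    for i in range(n):
        p = owner(i, src_block, nprocs)
        q = owner(i, dst_block, nprocs)
        if p != q:  # local copies need no message
            dests[p].add(q)
    return max(len(d) for d in dests)


def intermediate_block_sizes(s, t):
    """Candidate intermediate block sizes for general cyclic(s) -> cyclic(t):
    the least common multiple and the greatest common divisor of s and t."""
    g = gcd(s, t)
    return s * t // g, g


P = 32
N = 192 * P  # one full cycle of the cyclic(192) distribution

single_phase = max_startups(N, 192, 8, P)
two_phase = max_startups(N, 192, 48, P) + max_startups(N, 48, 8, P)
print(single_phase, two_phase)  # 24 10
print(intermediate_block_sizes(12, 8))  # (24, 4)
```

Under these assumptions, the single-phase cyclic(192) to cyclic(8) redistribution needs up to 24 messages per processor, while the two phases through cyclic(48) need up to 4 + 6 = 10, at the price of moving the data twice.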
Similar resources
Multi-Phase Redistribution: A Communication-Efficient Approach to Array Redistribution
Distributed-memory implementations of several scientific applications require array redistribution. Array redistribution is used in languages such as High Performance Fortran to dynamically change the distribution of arrays across processors. Performing array redistribution incurs two overheads: an indexing overhead for determining the set of processors to communicate with and the array elements...
Automated Analysis of Pressure Build up Tests Affected by Phase Redistribution
Analytical solutions and type curves for the constant-rate radial flow of fluid in both conventional and naturally fractured reservoirs, including the effect of wellbore phase redistribution, are presented. An automated procedure for non-linear least-squares minimization using the analytical solutions and their derivatives with respect to the unknown parameters is developed to analyze the pressur...
A Multi-attribute Reverse Auction Framework Under Uncertainty to the Procurement of Relief Items
One of the main activities of humanitarian logistics is to provide relief items for survivors in case of a disaster. To facilitate the procurement operation, this paper proposes a bidding framework for supplier selection and optimal allocation of relief items. The proposed auction process is divided into the announcement construction, bid construction and bid evaluation phases. In the announcem...
Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories
This paper is primarily concerned with the theoretical evaluation of the performance of multiprocessor systems. A Markovian waiting-line model has been developed for various multiprocessor configurations with shared memory. The system is analysed at the request level rather than the job level.
Message Encoding Techniques for Efficient Array Redistribution
In this paper, we present message encoding techniques to improve the performance of BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) (and vice versa) array redistribution algorithms. The message encoding techniques are machine independent and could be used with different algorithms. By incorporating the techniques in array redistribution algorithms, one can reduce the computation overheads and improve the...